Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kad: Change RecordStore trait interface - add results to methods. #3076

Conversation

shamil-gadelshin
Copy link
Contributor

Description

This PR starts an attempt to modify RecordStore trait interface. Currently, some of the operations lack the Result and it reduces the expressiveness of the custom implementations. Here is the discussion.

The initial version of the PR contains a change for the single operation - remove and serves as example with some open questions.

Changes

  • added Result for the remove operation of the RecordStore trait.
  • modified usages
  • modified tests

Links to any relevant issues

Original issue: #3035

Open Questions

There were made four changes for use cases of the remove method:

  • remove_record from Behaviour - it seemed native to just return the result
  • get_record from Behaviour - it seemed overkill to change the returned value to result here
  • inject_event - the system logic doesn't seem to mean returning a result here (or using some other means to propagate an error.
  • poll from the PutValueJob also doesn't seem to mean returning a result here (or using some other means to propagate an error.

Please, verify my assumptions here.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation Shouldn't be required for this.
  • I have added tests that prove my fix is effective or that my feature works. I modified tests.
  • A changelog entry has been made in the appropriate crates

Commit message body

Change the interface of the RecordStore trait. Add missed results for its operations.

- adding Result to the `remove` operation for the RecordStore trait.
@@ -131,8 +131,10 @@ impl<'a> RecordStore<'a> for MemoryStore {
Ok(())
}

fn remove(&'a mut self, k: &Key) {
fn remove(&'a mut self, k: &Key) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain in more detail, what benefits you are expecting to see from this?

Removing an element should be an idempotent operation in my opinion. What failure scenarios would you like to express here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current RecordStore implies the MemoryStore implementation that could be considered infallible which is not the case for most other custom implementations. Custom stores implemented with DB or file layers are always fallible for multiple reasons.

Just an example of a possible error for the remove operation: "No write permissions for the database file".

Also, the remove operation is just an example of the whole RecordStore API. Specifically, the most problems I have are with the records operation. My database returns a Result for such iterations and I needed to create a special empty iterator with some error logging which is not a good practice IMHO.

The original issue contains a similar concern from other users: #3035 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that DB or file system APIs are fallible. But what is the NetworkBehaviour supposed to do with the error? Log it? If there is no reasonable error handling strategy, we might as well not allow the implementation to return one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An implementation of RecordStore that interacts with a DB could also just represent a Handle that sends commands to the actual implementation, similar to how SQL DBs usually write to a journal first before modifying data on disk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that DB or file system APIs are fallible. But what is the NetworkBehaviour supposed to do with the error? Log it? If there is no reasonable error handling strategy, we might as well not allow the implementation to return one.

That depends on the use case and libp2p implementation (architecture). Some use cases are compatible with the Result already, for example, remove_record from Behaviour will return it to the caller. Some with the current code are hard to work with and logging seems the easiest change. I agree, that it will add complexity to the design, however, IMHO it makes the API more idiomatic and reflects a broader set of use cases.

An implementation of RecordStore that interacts with a DB could also just represent a Handle that sends commands to the actual implementation, similar to how SQL DBs usually write to a journal first before modifying data on disk.

I'm not sure I follow here. There are a lot of ways to implement the storage indeed and some will hide the errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important property I’d like to uphold is that NetworkBehaviour is about networking stuff, connection handlers care about handling connections, and other things live outside these abstractions — at least in terms of code module structure. Kademlia might require an implementation of RecordStore in its constructor, and it might use the provided functions from the NetworkBehaviour or from the ConnectionHandler, but the actual RecordStore should live outside these two pieces — also regarding resource usage.

One somewhat similar case is bitswap, where our store also is structured like this, but it has a fully synchronous API and is spawned inside a dedicated thread, with communication over async channels. Failures will need to be bubbled up to the network peer, plus logging (with careful tuning of levels — only things like “disk full” warrant a WARN or ERROR, everything else should stay below INFO, or preferably below DEBUG for per-interaction events).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for option 2 (direct integration into the ConnectionHandler). However, it seems like a major change. So, my question is: whether there is any benefit from this PR as the preparation for that refactoring (that unlikely will make it in the nearest working plan)? Otherwise, feel free to reject it.

Copy link
Contributor Author

@shamil-gadelshin shamil-gadelshin Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kademlia might require an implementation of RecordStore in its constructor

Did you mean this one?

pub fn new(id: PeerId, store: TStore) -> Self {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important property I’d like to uphold is that NetworkBehaviour is about networking stuff, connection handlers care about handling connections, and other things live outside these abstractions — at least in terms of code module structure.

Agreed.

Copy link
Contributor

@thomaseizinger thomaseizinger Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important property I’d like to uphold is that NetworkBehaviour is about networking stuff, connection handlers care about handling connections, and other things live outside these abstractions — at least in terms of code module structure. Kademlia might require an implementation of RecordStore in its constructor, and it might use the provided functions from the NetworkBehaviour or from the ConnectionHandler, but the actual RecordStore should live outside these two pieces — also regarding resource usage.

The reason I am challenging it is because we pay a very big price for this separation and the gains are hard to quantify for me:

  • There is a lot of message passing between ConnectionHandlers and NetworkBehaviours, only because we can't access a shared resource directly from the handlers.
  • Because of the message passing, we can't use async-await to describe our protocols, despite them being mostly request-response.

The interface to RecordStore could still look something like this:

trait RecordStore: Clone {
	fn get_value(&self, key: Key) -> GetValueFuture;
}

This would force implementations to use channels or other things to communicate with the centralised store (self receiver is not mutable and future doesn't have a lifetime). Thus, we would still uphold the "share memory by communicating" mantra and the connections themselves would only wait for a channel wake-up but not perform any other work.

At the same time though, this RecordStore can be referenced within each ConnectionHandler, the actual network protocol can be expressed in a few LoC and even adding things like timeouts for retrieving something from the store could be added trivially.

@shamil-gadelshin
Copy link
Contributor Author

Overall, this PR is a practical answer to this comment from Max: #3035 (comment)

@mergify
Copy link
Contributor

mergify bot commented Mar 22, 2023

This pull request has merge conflicts. Could you please resolve them @shamil-gadelshin? 🙏

@shamil-gadelshin
Copy link
Contributor Author

I'm closing this PR as obsolete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants